Bayesian sample size estimation for logistic regression

Authors

  • Anastasiya Motrenko
  • Vadim Strijov
  • Gerhard-Wilhelm Weber
Abstract

The paper is devoted to logistic regression analysis [1] applied to classification problems in biomedicine. A group of patients is investigated as a sample set; each patient is described by a set of features, called biomarkers, and is assigned to one of two classes. Since patient measurements are expensive, the problem is to reduce the number of measured features in order to increase the sample size. The response variable is assumed to follow a Bernoulli distribution, and the parameters of the regression function are estimated [2]. With the full set of features, the model is excessively complex. The problem is to select a smaller set of features that still classifies patients effectively. In logistic regression, features are usually selected by stepwise regression [3]. In the computational experiment, exhaustive search is implemented instead, which assures the experts that all possible combinations of the features were considered. The authors use the area under the ROC curve [4] as the optimality criterion in the feature selection procedure. The problem of classification is associated with minimum sample size determination. In the paper, the following methods are discussed:
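The feature-selection step described in the abstract (exhaustive search over feature subsets, scored by the area under the ROC curve of a logistic regression model) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `auc`, `fit_logistic`, `predict` and `exhaustive_search`, the synthetic data, the plain gradient-ascent fit, and the choice to score each subset on the same data it was fitted on are all assumptions made for illustration.

```python
from itertools import combinations
import math
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(w, X):
    """Bernoulli success probabilities; w[0] is the intercept."""
    probs = []
    for xi in X:
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs


def fit_logistic(X, y, steps=500, lr=0.1):
    """Gradient ascent on the mean Bernoulli log-likelihood (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi, pi in zip(X, y, predict(w, X)):
            err = yi - pi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def exhaustive_search(X, y, max_size):
    """Fit a model on every feature subset up to max_size; keep the best AUC."""
    best = (0.0, ())
    for k in range(1, max_size + 1):
        for subset in combinations(range(len(X[0])), k):
            Xs = [[row[j] for j in subset] for row in X]
            score = auc(predict(fit_logistic(Xs, y), Xs), y)
            if score > best[0]:
                best = (score, subset)
    return best


# Hypothetical data: feature 0 carries the signal, features 1 and 2 are noise.
random.seed(0)
X, y = [], []
for _ in range(60):
    row = [random.gauss(0, 1) for _ in range(3)]
    X.append(row)
    y.append(1 if row[0] + 0.5 * random.gauss(0, 1) > 0 else 0)

best_auc, best_subset = exhaustive_search(X, y, max_size=3)
print(best_subset, round(best_auc, 3))
```

Because the number of subsets grows exponentially with the number of features, this kind of search is only practical for small feature sets; scoring on held-out data rather than the fitting data would give a less optimistic AUC.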


Similar resources

Sample size determination for logistic regression

The problem of sample size estimation is important in medical applications, especially in cases of expensive measurements of immune biomarkers. This paper describes the problem of logistic regression analysis with the sample size determination algorithms, namely the methods of univariate statistics, logistic regression, cross-validation and Bayesian inference. The authors, treating the regr...


Bayesian and Iterative Maximum Likelihood Estimation of the Coefficients in Logistic Regression Analysis with Linked Data

This paper considers logistic regression analysis with linked data. It is shown that, in logistic regression analysis with linked data, a finite mixture of Bernoulli distributions can be used for modeling the response variables. We proposed an iterative maximum likelihood estimator for the regression coefficients that takes the matching probabilities into account. Next, the Bayesian counterpart...


Sample Size Bayesian Estimation for Logistic Regression

The problem of sample size estimation is important in medical applications, especially in cases of expensive measurements of immune biomarkers. The paper describes the problem of logistic regression analysis, including model feature selection, and includes the sample size determination algorithms, namely methods of univariate statistics, logistic regression, cross-validation and Bayesia...


Bayesian Sample Size Computing for Estimation of Binomial Proportions using p-tolerance with the Lowest Posterior Loss

This paper is devoted to computing the sample size for a binomial distribution with a Bayesian approach. The quadratic loss function is considered and three criteria are applied to obtain p-tolerance regions with the lowest posterior loss. These criteria are: average length, average coverage and worst outcome.


Bayesian Estimation of Change Point in Phase One Risk Adjusted Control Charts

The use of risk-adjusted control charts for monitoring patients' surgical outcomes is now popular. These charts are developed by taking the patient's pre-operation risks into account. Change point detection is a crucial problem in statistical process control (SPC). It helps managers analyze the root causes of out-of-control conditions more effectively. Since the control chart signals do not necessarily...


Bayesian Inference for Spatial Beta Generalized Linear Mixed Models

In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. The beta distribution is a flexible family of densities on the interval (0, 1) that covers symm...




Publication date: 2012